Semantic and syntactic information for neural machine translation

Abstract

Introducing factors such as linguistic features has long been proposed in machine translation to improve the quality of translations. More recently, factored machine translation has proven to still be useful in the case of sequence-to-sequence systems. In this work, we investigate whether these gains hold in the case of the state-of-the-art architecture in neural machine translation, the Transformer, instead of recurrent architectures. We propose a new model, the Factored Transformer, to introduce an arbitrary number of word features in the source sequence in an attentional system. Specifically, we suggest two variants depending on the level at which the features are injected. Moreover, we suggest two combination mechanisms for the word features and the words themselves. We experiment both with classical linguistic features and with semantic features extracted from a linked data database, and with two low-resource datasets. With the best-found configuration, we show improvements of 0.8 BLEU over the baseline Transformer in the IWSLT German-to-English task. Moreover, we experiment with the more challenging FLoRes English-to-Nepali benchmark, which includes both extremely low-resourced and very distant languages, and obtain an improvement of 1.2 BLEU. These improvements are achieved with linguistic and not with semantic information.
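As a rough illustration of what combining word features with the words themselves can look like at the embedding level, the sketch below embeds each source token and its aligned features separately and then merges the vectors. The abstract does not name the two combination mechanisms, so the `concat` and `sum` modes here are assumptions chosen for illustration, as are the helper names `make_table` and `embed_factored`, the toy vocabulary sizes, and the choice of part-of-speech and dependency labels as features. This is not the authors' implementation.

```python
import random


def make_table(vocab_size, dim, seed):
    """Build a toy embedding table: vocab_size rows of dim random floats."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]


def embed_factored(word_ids, factor_ids, word_table, factor_tables, mode="concat"):
    """Combine word embeddings with any number of aligned feature embeddings.

    word_ids:      list of token ids for the source sentence
    factor_ids:    one list of feature ids per factor, aligned with word_ids
    factor_tables: one embedding table per factor
    mode:          "concat" or "sum" (assumed mechanisms, not taken from the abstract)
    """
    combined_sequence = []
    for pos, word_id in enumerate(word_ids):
        vectors = [word_table[word_id]]
        for table, ids in zip(factor_tables, factor_ids):
            vectors.append(table[ids[pos]])
        if mode == "concat":
            # Output size is the sum of all embedding sizes.
            combined = [x for vec in vectors for x in vec]
        elif mode == "sum":
            # Summation needs every embedding to have the same size.
            if len({len(vec) for vec in vectors}) != 1:
                raise ValueError("sum mode requires equal embedding sizes")
            combined = [sum(xs) for xs in zip(*vectors)]
        else:
            raise ValueError(f"unknown mode: {mode}")
        combined_sequence.append(combined)
    return combined_sequence


# Toy setup: 3 source tokens, each with a POS tag id and a dependency label id.
word_table = make_table(vocab_size=10, dim=4, seed=0)
pos_table = make_table(vocab_size=5, dim=4, seed=1)
dep_table = make_table(vocab_size=6, dim=4, seed=2)

word_ids = [3, 7, 1]
pos_ids = [0, 2, 4]
dep_ids = [5, 1, 1]

concat_out = embed_factored(word_ids, [pos_ids, dep_ids], word_table,
                            [pos_table, dep_table], mode="concat")
sum_out = embed_factored(word_ids, [pos_ids, dep_ids], word_table,
                         [pos_table, dep_table], mode="sum")
```

In this sketch, concatenation lets each feature keep its own embedding size at the cost of a wider input, while summation keeps the input width fixed but forces all embeddings to share one size. Either output would then be fed to the encoder in place of the plain word embeddings.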

Related resources

Combining Semantic and Syntactic Generalization in Example-Based Machine Translation

In this paper, we report our experiments in combining two EBMT systems that rely on generalized templates, Marclator and CMU-EBMT, on an English–German translation task. Our goal was to see whether a statistically significant improvement could be achieved over the individual performances of these two systems. We observed that this was not the case. However, our system consistently outperformed ...

Statistical Machine Translation of English – Manipuri using Morpho-syntactic and Semantic Information

The English-Manipuri language pair is rarely investigated and has restricted bilingual resources. The development of a factored Statistical Machine Translation (SMT) system between English as the source and Manipuri, a morphologically rich language, as the target is reported. The role of the suffixes and dependency relations on the source side and case markers on the target side are identified as im...

Transferring Semantic Roles Using Translation and Syntactic Information

Our paper addresses the problem of annotation projection for semantic role labeling for resource-poor languages using supervised annotations from a resource-rich language through parallel data. We propose a transfer method that employs information from source and target syntactic dependencies as well as word alignment density to improve the quality of an iterative bootstrapping method. Our expe...

Detecting Cross-Lingual Semantic Divergence for Neural Machine Translation

Parallel corpora are often not as parallel as one might assume: non-literal translations and noisy translations abound, even in curated corpora routinely used for training and evaluation. We use a cross-lingual textual entailment system to distinguish sentence pairs that are parallel in meaning from those that are not, and show that filtering out divergent examples from training improves transl...

Pivot-Based Semantic Splicing for Neural Machine Translation

Current neural machine translation (NMT) usually extracts a fixed-length semantic representation for the source sentence, and then depends on this representation to generate the corresponding target translation. In this paper, we propose a pivot-based semantic splicing model (PBSSM) to obtain a semantic representation that includes more translation information for the source sentence, thus improving the transl...

Journal

Journal title: Machine Translation

Year: 2021

ISSN: 0922-6567, 1573-0573

DOI: https://doi.org/10.1007/s10590-021-09264-2